A social image recommendation system based on deep reinforcement learning

Today, due to the expansion of the Internet and social networks, people are faced with a vast amount of dynamic information. To mitigate the issue of information overload, recommender systems have become pivotal by analyzing users’ activity histories to discern their interests and preferences. However, most available social image recommender systems utilize a static strategy, meaning they do not adapt to changes in user preferences. To overcome this challenge, our paper introduces a dynamic image recommender system that leverages a deep reinforcement learning (DRL) framework, enriched with a novel set of features including emotion, style, and personality. These features, uncommon in existing systems, are instrumental in crafting a user’s characteristic vector, offering a personalized recommendation experience. Additionally, we overcome the challenge of state representation definition in reinforcement learning by introducing a new state representation. The experimental results show that our proposed method, compared to some related works, significantly improves Recall@k and Precision@k by approximately 7%–10% (for the top 100 images recommended) for personalized image recommendation.



Introduction
In recent years, with the development of the Internet and social networks, considerable research effort has been directed at recommender systems. Faced with the huge volume and variety of available information, we need systems that can automatically identify users' interests and provide relevant content. Despite various advances, existing recommender systems require further improvement to provide more efficient recommendations applicable to a broader range of applications. The exponential increase of images as an important source of information in online services and social networks prompted us to investigate image-based recommender systems for social networks.
Many studies in this field rely on handcrafted features to identify user interests [1,2], and many others use deep learning models [3][4][5] for image representation and recommendation. Furthermore, some similarity measures [6] and classification methods [7] have been used in image recommender systems. Notably, all of these approaches treat recommendation as a static procedure and assume that the user's underlying preferences remain unchanged, whereas the system should interact dynamically with the user because user preferences change over time. Therefore, we need a recommendation method that treats recommendation as a dynamic process.
Today, reinforcement learning (RL) is used for its ability to solve complex problems requiring dynamic modeling and long-term planning. The use of reinforcement learning in recommender systems is not new. Model-based RL techniques such as POMDP [8] and Q-learning [9] were among the first RL techniques used to model recommendation. However, when the number of items to recommend is large, these methods become inefficient because of their time complexity. Later, model-free RL techniques were also applied to recommendation; they fall into two categories: value-based [10][11][12] and policy-based [12][13][14].
Although using RL for recommender systems is not new, traditional RL algorithms were not very practical, mainly because of scalability problems. With the advent of deep reinforcement learning (DRL), a new trend emerged that enabled the application of RL to recommendation problems with large state and action spaces [15]. Several studies use deep reinforcement learning for recommendation. The authors of [16] propose a recommendation framework based on deep reinforcement learning. They model the interactions between users and the recommender system using an actor-critic reinforcement learning scheme and consider both dynamic adaptation and long-term rewards. Zhao et al. [12] used deep reinforcement learning to automatically learn optimal recommendation strategies and modeled recommendation as a Markov decision process.
Because recommender systems based on reinforcement learning are practical for real-world problems, we propose using deep reinforcement learning to model the sequential interactions between users and the recommender system and to learn optimal strategies from users' feedback automatically. The proposed method consists of two main parts, each with its own innovations. In the first part, we extract a set of features from images that represent the characteristics and preferences of the user. To this end, we use three new components (emotion analysis, personality recognition, and style detection) to represent users and improve the social image recommender system. In the second part, we propose a new framework based on actor-critic reinforcement learning with a state representation module to obtain a dynamic recommender system. Our contributions can be summarized as follows:
1) We propose an image recommender system in a new deep reinforcement learning framework.
2) We introduce a new method for state representation.
3) We introduce three components (style, emotion, personality) that can be useful in image recommendation, build a user characteristic vector from them, and investigate its effect on the images recommended to each user.
The remainder of this paper is organized as follows. Section 2 presents related work and background. Section 3 introduces the proposed method. Section 4 discusses the experimental details and results. Finally, Section 5 concludes the paper and discusses future work.

Image recommendation related works
In this section, we divide recommender systems into two categories, non-RL (traditional) recommender systems and RL-based recommender systems, and review related work in each. Recommender systems have attracted considerable research attention and are used in many fields [17][18]. Traditional recommender systems can be divided into three categories: collaborative filtering, content-based filtering, and hybrid [18]. Collaborative filtering relies on the ratings users give to items and uses these ratings to find similar users. Content-based filtering relies on the content features of items to find similarities between items. Hybrid methods combine the capabilities of both.
With the increasing number of images on the Internet and social networks, image recommender systems have become an important topic. We therefore focus on image-based recommendation, especially of social images. Some studies have investigated methods for extracting features from images to understand user interests. Lovato et al. [1] considered the images tagged as favorites by a user and extracted handcrafted features from them, such as HSV statistics and use of light. Guntuku et al. [7] extracted, in addition to the low-level features of [1], a set of high-level features such as head and upper-body recognition, visual clutter, and tags. They proposed a deep bimodal knowledge representation model that increased efficiency by using visual and tag features. The method presented in [19] extracts various features, applies feature selection to retain the important ones, and then uses a fuzzy inference system for image recommendation. In [20], style is used in recommendation, and the results show that it improves performance. Some previous studies have used image-level features, deep methods, and social information to determine user preferences [21][22][23]. However, defining new components related to user preferences can still be useful.
Most traditional recommendation methods model user-item interactions with supervised learning, for example classification or memory-based content filtering from user history. These methods ignore the dependence between successive time steps. Unlike the supervised learning setting, where a supervisor provides the right action, it is widely agreed that formulating the problem as a sequential decision problem better reflects the user-system interaction [15]. It can therefore be solved by reinforcement learning, which relies on the environment to discover the right action: learning takes place through interaction with an environment. Shani et al. [8] used Markov decision processes (MDPs) for recommender systems to consider the long-term effect of recommendations and the expected value of each recommendation. Taghipour et al. [9] used Q-learning to model web page recommendation. However, these model-based RL techniques [9] are inapplicable when there are many candidate items for recommendation, because updating the model requires a time-consuming dynamic programming step. Therefore, model-free RL techniques have been preferred for recommender systems. These techniques can be divided into two categories: value-based [10,11] and policy-based [13,14,19].
In value-based approaches, the Q-values of all available actions must be computed for a given state, and the action with the maximum Q-value is selected as the best action. Therefore, when the action space is very large, these approaches may become very inefficient.
Policy-based approaches generate a continuous parameter vector to represent an action [13,14,19]. This vector can be used both to generate the recommendation and to update the Q-value evaluator, so these approaches can overcome the inefficiency of value-based methods.
As deep neural networks developed, so did deep reinforcement learning. Deep RL techniques have recently attracted attention in the recommender systems community because they enable the use of RL in problems with large state and action spaces. Huang et al. [24] treated the recommendation process as a Markov decision process and proposed a top-N model based on deep reinforcement learning for long-term prediction, in which a recurrent neural network simulates the interactions between the recommender system and the user. Zheng et al. [10] applied deep Q-learning to news recommendation to effectively model dynamic news features and user preferences. Liu et al. [16] used actor-critic reinforcement learning to model the interactions between users and recommender systems, achieving dynamic adaptation and long-term rewards. Zhao et al. [12] proposed list-wise recommendation based on deep reinforcement learning to automatically learn optimal recommendation strategies. In another study, Zhao et al. [25] used deep reinforcement learning for page-wise recommendation, which automatically learns optimal recommendation strategies and optimizes a page of items simultaneously.

Proposed Method
Figure 1 shows the workflow of the proposed method, which consists of two main parts: feature extraction and deep reinforcement learning to learn the recommendation process. For feature extraction, we propose three new components (emotion analysis, personality recognition, and style detection) to improve the social image recommender system. In the second part, we propose a new framework based on actor-critic reinforcement learning to obtain a dynamic recommender system. In the following, we first describe the methods used for feature extraction and then present the proposed recommender system based on deep reinforcement learning and the extracted features.

3-1) Feature Extraction
The proposed recommender system framework contains four feature extraction components (Figure 1): visual features, a style feature extractor, an emotion feature extractor, and a personality feature extractor (these features are listed in Table 1). The model designed and used for each is described below. For each image, we extract an emotion vector that expresses the emotions in the image, a style vector that expresses its most prominent styles, and a personality vector that expresses the prominent personality traits associated with the image. The underlying idea is that a user tags an image as a favorite when it matches the user's interests in terms of emotion, personality, and style. To extract these three features, we first train a model for each one and then obtain the corresponding feature vector by passing an image through each model. After extracting these components, we use reinforcement learning to build a new social image recommender system.

A) Emotion feature extractor
Understanding emotions in images has attracted considerable interest because of its various applications. Many researchers, inspired by psychology and artistic principles, have investigated features extracted from images to automatically obtain a single emotion for each image [26,27]. In the past few years, with the popularity of convolutional neural networks (CNNs), researchers [28,29] have used CNNs to recognize image sentiment and have demonstrated that deep features perform better than hand-tuned features. Recent CNN-based algorithms significantly improve emotion classification, which aims to detect differences between emotion categories and assign a predominant label to each image. Some studies have increased the depth of neural networks, and in some cases it has been shown that the ability of a network to extract higher-level features improves as its depth increases. The EfficientNet work [30] argues that CNN models should be scaled in a principled way to achieve better accuracy and efficiency, and that carefully balancing the network's depth, width, and resolution can improve performance. On this basis, it proposes a scaling method that scales depth, width, and resolution uniformly using a simple but very effective compound coefficient.
In the present paper, we investigate the use of EfficientNet, which is expected to lead to better results. In this method, instead of expanding the network along only one of depth, width, or resolution, all three are expanded jointly to achieve the highest efficiency with a smaller number of parameters. We propose using a fine-tuned EfficientNet network to analyze the emotion of images.
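The compound scaling rule of [30] can be illustrated with a short sketch. The base constants 1.2, 1.1, and 1.15 are the depth, width, and resolution coefficients reported for EfficientNet in [30]; the function names and the example values of the compound coefficient phi are illustrative assumptions and do not describe the exact configuration used in this work.

```python
# Illustrative sketch of EfficientNet-style compound scaling.
# Base coefficients as reported in [30]; everything else is hypothetical.
DEPTH_BASE = 1.2        # alpha
WIDTH_BASE = 1.1        # beta
RESOLUTION_BASE = 1.15  # gamma (not the RL discount factor)


def compound_scale(phi):
    """Return the multipliers applied to depth, width, and resolution."""
    return {
        "depth": DEPTH_BASE ** phi,
        "width": WIDTH_BASE ** phi,
        "resolution": RESOLUTION_BASE ** phi,
    }


def flops_multiplier(phi):
    """Approximate growth in FLOPs: (alpha * beta^2 * gamma^2) ** phi."""
    return (DEPTH_BASE * WIDTH_BASE ** 2 * RESOLUTION_BASE ** 2) ** phi


for phi in (0, 1, 2):
    print(phi, compound_scale(phi), round(flops_multiplier(phi), 3))
```

Because the product alpha * beta^2 * gamma^2 is close to 2, each unit increase of phi roughly doubles the computational cost while growing all three dimensions together.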

B) Personality feature extractor
Personality analysis can be one of the most important means of distinguishing between users in terms of their preferences and behavior, and many studies have investigated it so far [31,32]. In this research, we examine personality analysis as one of the influential components of the social image recommender system. To predict personality traits, we use CNNs pre-trained for image classification. These networks are trained on a large set of images, and their intermediate layers capture the semantics of general visual appearance. We exploit the power of these networks and fine-tune them for our problem to learn visual representations correlated with personality traits.
We take an approach similar to that of [33], which models the personality traits of users based on the Five Factor Theory. Accordingly, five distinct binary classifications are considered, one for each trait. We take CNNs trained on ImageNet, fine-tune them, and replace the last layer to adapt them to the binary classification of each trait.

C) Style feature extractor
Image style plays an important role in how an image looks. Automatic image style recognition is crucial for many applications, including artwork analysis, photo organization, and image retrieval [34][35][36]. However, the use of style in social image recommender systems has not received much attention. This study proposes image style detection as a component for better understanding user preferences. For this purpose, we must first obtain the style vector of each image.
Our previous work [20] proposed an image style detection method based on deep correlation features and a compact convolutional transformer (CCT). This method is based on convolution and tries to preserve local information. Its main idea is to replace the simple convolutional block in CCT with a new convolutional block.
For this purpose, we use the convolutional layers of VGG-19, pre-trained on the ImageNet dataset, as the convolutional block, and fine-tune part of them while training the compact transformer. In this paper, we use the method proposed in [20] to extract the style features of images.

D) Visual Feature
Visual features in Table 1 are part of the features used in [1].

3-2) Recommendation based on Reinforcement Learning
We model our proposed method in the reinforcement learning setting. In our context, the agent is the recommender system, which interacts with the environment (users) and receives rewards from the environment (feedback from users).
Rewards indicate whether the course of action the agent is taking is right or wrong. The agent is constantly in contact with the environment and eventually learns to take the right action through the feedback received from the environment over time. The underlying reinforcement learning model is the Markov decision process (MDP), which comprises a sequence of states, actions, and rewards. An MDP is determined by the following five components:
State space S: The state is the agent's situation at a given moment, as produced by the environment. In our work, the state s represents the user's positive interaction history with the recommender, and we propose a state representation module to represent it (Section 3-2-1).
Action space A: An action a ∈ A is one of the possible responses of the agent to the current state. It can be a list of items recommended to the user in the current state.
Reward R: The immediate feedback sent to the agent by the environment after evaluating each action.
Transition probability P: The probability of the state transition from $s_t$ to $s_{t+1}$ when the agent performs action $a_t$.
Discount factor γ: A factor between 0 and 1 that measures the present value of long-term rewards. If γ = 0, the recommender ignores long-term rewards and considers only immediate ones. When γ = 1, the recommender considers long-term and immediate rewards equally important.
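The effect of the discount factor can be illustrated with a minimal sketch of the discounted cumulative reward. The reward values of +1 (liked) and -1 (not liked) and the function name are assumptions made for illustration only.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


# Hypothetical feedback sequence: liked, not liked, liked.
feedback = [1.0, -1.0, 1.0]

print(discounted_return(feedback, 0.0))  # only the immediate reward counts
print(discounted_return(feedback, 0.5))  # later rewards count less
print(discounted_return(feedback, 1.0))  # all rewards count equally
```

With γ = 0 only the first reward contributes, whereas with γ = 1 every reward in the sequence has the same weight.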
As mentioned above, most previous work has treated recommendation as a static process and has used pre-trained models for recommendation; such approaches cannot simulate the interaction between users and the system.
We therefore propose a new social image recommender system in the deep reinforcement learning framework. To this end, the deep deterministic policy gradient (DDPG) algorithm is used to provide recommendations. DDPG is an actor-critic technique that combines Q-learning and policy gradients. It consists of two models, the actor network and the critic network.
The actor (policy) network takes the state as input and generates an action for a given user based on that state. The user's state vector is determined at each step by the state representation module, which derives the state from the history of images the user has liked (the proposed module is described in Section 3-2-1). The actor network consists of two ReLU layers and one Tanh layer, with a skip connection in the penultimate layer. The user state is given as input to the actor network, which generates the action vector used to provide recommendations to the user. Because there are many images that could be recommended, we use a scoring function based on the inner product, defined as follows, to determine the recommended image:
$\mathrm{score}_i = a^\top x_i$ (1)
where $a$ is the output of the actor and $x_i$ is the vector representation of item $i$. After the scores are calculated, the images are ranked by score, and the top-ranked image is recommended.
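The inner-product scoring and ranking step described above can be sketched as follows. The item identifiers, the three-dimensional vectors, and the function names are hypothetical; in the proposed system the action vector would come from the actor network and the item vectors from the feature extractors.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_items(action, item_vectors):
    """Return item ids sorted by decreasing inner-product score with the action."""
    scores = {item_id: dot(action, vec) for item_id, vec in item_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical action vector and candidate image vectors.
action = [0.5, -0.2, 0.1]
candidates = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [1.0, -1.0, 1.0],
}

ranking = rank_items(action, candidates)
print(ranking[0])  # top-ranked image, which would be recommended
```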

The critic network:
The critic is a deep Q-network, that is, a deep neural network parameterized as $Q(s, a)$. The generated action vector and the user's state are given as input to the critic network to determine how good the generated action is. According to $Q(s, a)$, the actor network parameters are updated to improve the action $a$, that is, to increase $Q(s, a)$.
Because there are many images that could be recommended and the recommended image must be determined with the help of the action vector, we keep a memory of the recommendation history to support the recommendation process. This memory stores all images recommended to the user, both those the user liked and those the user did not. We apply the Lasso algorithm to the information stored in this memory to obtain a vector of Lasso coefficients. We then compute the dot product between this coefficient vector and every candidate image to select a list of images that are better candidates for recommendation.
After computing this dot product for all candidate images, we treat the images with high scores as better candidates, and we use the dot product between the selected images and the action vector obtained from the actor network to create recommendations using equation (1).
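The two-stage selection described above can be sketched as follows. This is only an illustration under stated assumptions: the Lasso is fitted by plain coordinate descent on the objective (1/2n)·||y − Xw||² + α·||w||₁ without an intercept, feedback is encoded as +1 (liked) and -1 (not liked), and all names, vectors, and parameter values are hypothetical. The paper does not specify these details.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(features, targets, alpha=0.05, n_iters=100):
    """Coordinate descent for Lasso regression without an intercept."""
    n, d = len(features), len(features[0])
    weights = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho, z = 0.0, 0.0
            for x, y in zip(features, targets):
                # Residual with the contribution of feature j removed.
                partial_residual = y - dot(weights, x) + weights[j] * x[j]
                rho += x[j] * partial_residual
                z += x[j] * x[j]
            weights[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return weights


def shortlist(candidates, lasso_weights, top_m):
    """Stage 1: keep the top_m candidates by Lasso score."""
    ranked = sorted(candidates,
                    key=lambda name: dot(lasso_weights, candidates[name]),
                    reverse=True)
    return ranked[:top_m]


def recommend(candidates, lasso_weights, action, top_m, top_k):
    """Stage 2: rank the shortlisted candidates by the actor's action vector."""
    names = shortlist(candidates, lasso_weights, top_m)
    return sorted(names, key=lambda name: dot(action, candidates[name]),
                  reverse=True)[:top_k]


# Hypothetical recommendation memory: image vectors and user feedback.
memory_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
memory_y = [1.0, 1.0, -1.0, -1.0]
candidates = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.8, 0.2], "d": [0.5, 0.5]}

weights = fit_lasso(memory_x, memory_y)
print(shortlist(candidates, weights, top_m=2))
print(recommend(candidates, weights, action=[0.2, 1.0], top_m=2, top_k=1))
```

The first stage narrows the candidate pool using past feedback, so the inner product with the action vector only has to be evaluated on a short list.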
The reward function is defined in Eq. 2. When an image is recommended to the user, a positive reward is returned to the system if the user likes the image, and a negative reward is returned if the user does not.

3-2-1) State representation module
State representation in reinforcement learning is a challenging problem. We therefore propose a new method for state representation that uses the self-attention idea [37] to represent the user's state. At every step, when an image is recommended to the user and the user likes it, we compute the self-attention of the recommended image with the other images in the collection that the user likes. Equation 3 shows how the user state is updated. We denote the current state by $s_t$ and the next state by $s_{t+1}$, and PositiveState denotes the set of items recommended in previous steps that the user liked. The remaining symbol in Equation 3 denotes the item recommended at the current step.
Figure 5 shows the proposed approach for determining the next state of the user. Let $x_2$ be the image that the user recently liked, and assume $\{x_1, x_3, x_4\}$ is the user's collection of favorite images. The following method is used to calculate the next state of the user ($s$).
Self-attention is the central mechanism of the Transformer. Its purpose is to measure the dependencies among the components of a sequence in order to obtain a more accurate representation of the whole sequence. In our work, we assume a sequence of images liked by the user, and the image $x_2$ that the user recently liked enters the state representation module. To produce the output for the entered image (the next state of the user), we compute, as in self-attention, a weighted average over the inputs, where the inputs are the recently liked image and the images liked in previous steps.
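The weighted-average state update described above can be sketched as follows, using softmax-normalized dot products as attention weights. The two-dimensional vectors and the function names are hypothetical, and including the recently liked image itself in the averaged sequence is an assumption of this sketch.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(values):
    """Map raw scores to positive weights that sum to 1."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(recent, liked_history):
    """Weights of the recently liked image against every image in the sequence."""
    sequence = [recent] + list(liked_history)
    raw = [dot(recent, x_j) for x_j in sequence]
    return softmax(raw), sequence


def next_state(recent, liked_history):
    """Next user state: attention-weighted average of the liked images."""
    weights, sequence = attention_weights(recent, liked_history)
    return [sum(w * x[d] for w, x in zip(weights, sequence))
            for d in range(len(recent))]


# Hypothetical vectors: x2 was just liked; x1, x3, x4 were liked earlier.
x2 = [2.0, 0.0]
x1, x3, x4 = [2.0, 0.0], [0.0, 2.0], [1.0, 1.0]

state = next_state(x2, [x1, x3, x4])
print(state)
```

Images that are similar to the recently liked one receive larger weights, so the new state is pulled toward the part of the user's history that matches the latest feedback.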
To generate the output vector $s$, we take a weighted average over all available input vectors:
$s = \sum_j w_{ij} x_j$
where $j$ indexes over the whole sequence and the weights sum to one over all $j$. The weight $w_{ij}$ is derived from a function over $x_i$ and $x_j$. The simplest option for this function is the dot product:
$w'_{ij} = x_i^\top x_j$ (6)
Note that $x_i$ is the input vector of the recently liked image, and $x_j$ is an image previously liked by the user.
The dot product gives a value between negative infinity and positive infinity, so we use a softmax to map the values to [0,1] and make sure they sum to 1 over the entire sequence:
$w_{ij} = \frac{\exp(w'_{ij})}{\sum_k \exp(w'_{ik})}$

In this study, we used four datasets.
1) Flickr Style [38] is used for recognizing image style. The original dataset consists of 80,000 images annotated with 20 style labels. We could not collect all the images because some are no longer available on Flickr. In this study, 15 style labels are used, with 2900 images for each style. Fig. 3 shows seven sample images from this dataset with the serene, melancholy, ethereal, noir, vintage, romantic, and horror styles.
2) The dataset introduced in [28] is used to detect the emotion of an image. It contains images of the eight emotions listed in Table 2: amusement, anger, awe, contentment, disgust, excitement, fear, and sadness.
Because some images are no longer available, the dataset used in this article contains fewer images than the original. Table 2 shows the eight emotions and the number of images for each. Eight sample images with their emotions are shown in Fig. 4.
3) The PsychoFlickr dataset is used for personality analysis [33]. It is a collection of 60,000 images tagged as favorites by 300 Pro Flickr users (200 randomly selected favorites per user).
4) To evaluate image recommendation, we use part of the dataset used in [1]. We consider a collection of 4000 images belonging to 20 Flickr users, with 200 images tagged as favorites for each user. Random samples of images tagged as favorites by users are shown in Fig. 5.
The first three datasets are used to train the style extraction, emotion extraction, and personality analysis models, respectively, and the fourth dataset is used to evaluate the performance of the proposed recommender system.

Implementation results of feature extraction methods
To build a recommender system based on reinforcement learning, we must first extract the features described above.
This requires training models to extract emotion, personality, and style characteristics. The implemented models and their results are briefly described below.
Emotion. We first built a system to recognize the emotion of each image. Our framework is based on the EfficientNetB1 model. The network is first initialized with weights trained for image classification on the ImageNet dataset and then fine-tuned for our problem. Because the number of classes in the emotion dataset differs from that of ImageNet, the last fully connected layer is replaced with one that has the required number of classes for our target dataset and outputs a probability distribution over the emotion labels. Because of system limitations, the images were resized to 200×200. As shown in Table 2, the dataset used for the emotion recognition model includes 21,000 images, of which 80% are used for training, 15% for testing, and 5% for validation. Table 3 reports the results of the four methods implemented to recognize emotions. The results show that FinetuneFMEfficientNetB1 (Imagenet), the fine-tuned EfficientNetB1 network, leads to better results than the other methods. We therefore built the emotion analysis component of the recommender system on this model.

Personality.
For personality feature extraction, we take an approach similar to that of [33].
We model the personality traits of users based on the Five Factor Theory. Accordingly, five distinct binary classifications are considered, one for each trait. For each trait, the range of values is divided into three parts: values below the first quartile form the low set, values above the third quartile form the high set, and the remaining values form the middle set. For the binary classification problem, to better separate the two classes, we select only users with values in the low and high sets.
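The quartile-based selection of users for one trait can be sketched as follows. The user identifiers and trait scores are made up for illustration, and the quartiles are computed with Python's `statistics.quantiles` using its default method, which is an assumption rather than the exact procedure of [33].

```python
import statistics


def binary_trait_labels(trait_scores):
    """Label users below Q1 as 0 (low) and above Q3 as 1 (high); drop the rest."""
    values = list(trait_scores.values())
    q1, _median, q3 = statistics.quantiles(values, n=4)
    labels = {}
    for user, score in trait_scores.items():
        if score < q1:
            labels[user] = 0
        elif score > q3:
            labels[user] = 1
        # Users in the middle set are excluded from the binary problem.
    return labels


# Hypothetical scores of eight users on a single trait.
openness = {f"user_{i}": float(i) for i in range(1, 9)}
labels = binary_trait_labels(openness)
print(labels)
```

Repeating this for each of the five traits yields five separate binary classification problems.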
We considered two CNNs, VGG16 and VGG19, trained on ImageNet, and changed the last layer to adapt them to the personality recognition problem. These pre-trained networks are ideal candidates for fine-tuning because they were trained on many images (1.2 million) and many classes (1000 object categories), which gives them very strong representational power. The implementation results are shown in Table 4. In our experiments, images were resized to 160×160. We investigate attributed traits because, in a recommender system, we only have the images liked by a user and no other information about the user's personality.
The dimension of the image embedding vector is determined by the extracted features and is 107, and we set the discount factor γ = 0.5.
We calculate Recall@K and Precision@K over all users using Eqs. (2) and (3). Figs. 6 and 7 show the results: the proposed method outperforms the baselines, demonstrating its effectiveness. Recall@K is the proportion of all relevant items (those liked by the user) that appear in the top-K recommendations. Precision@K is the proportion of the top-K recommended items that are relevant.
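As an illustration of the two metrics, the following sketch uses the standard definitions of Precision@K and Recall@K for a single user; per-user values would then be averaged over all users. The item identifiers and function names are hypothetical.

```python
def hits_at_k(ranked_items, relevant_items, k):
    """Number of relevant items among the first k ranked items."""
    return sum(1 for item in ranked_items[:k] if item in relevant_items)


def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        return 0.0
    return hits_at_k(ranked_items, relevant_items, k) / k


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant_items:
        return 0.0
    return hits_at_k(ranked_items, relevant_items, k) / len(relevant_items)


# Hypothetical ranking for one user and that user's favorite images.
ranked = ["a", "b", "c", "d", "e"]
favorites = {"a", "c", "f"}

print(precision_at_k(ranked, favorites, 2), recall_at_k(ranked, favorites, 2))
print(precision_at_k(ranked, favorites, 5), recall_at_k(ranked, favorites, 5))
```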
The top 15 images recommended to one of the users by the different methods are shown in Fig. 8. For this illustration, we sorted the test images by the score obtained from each method and took the 15 images with the highest scores as the top-K recommendations of that method. The top 15 images recommended by the proposed method are shown in Fig. 8f. Only two of these 15 images were not part of the user's favorite collection, whereas with the other methods a larger number of recommended images fall outside the user's favorites, which shows the advantage of the proposed method. We attribute this to two factors. First, we consider user preferences from several aspects and analyze more features of the user, applying emotion, style, and personality components to recognize user preferences. Second, we use an interactive method in which the system is updated according to the user's preferences.

CONCLUSION
In this paper, we proposed a new social image recommender system based on deep reinforcement learning that learns optimal recommendation strategies automatically, interacts directly with the user, and uses the resulting feedback to continuously improve and update its strategies. RL-based recommendation can also learn a strategy that maximizes the long-term cumulative reward from users. To represent the user's state, we proposed a state representation module that updates the state at each step using the images the user liked in previous steps. In addition, we proposed using a different collection of features (emotion, style, and personality) to recognize the user's preferences. The results show that the proposed method outperforms the other methods considered. Future work could incorporate other features related to user behavior. More accurate methods for detecting emotion, style, and personality could also be investigated, which is expected to improve the recommender system.
Funded studies Enter a statement with the following details: Initials of the authors who received each award • Grant numbers awarded to each author • The full name of each funder • URL of each funder website • Did the sponsors or funders play any role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript?• NO -Include this sentence at the end of your statement: The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.• YES -Specify the role(s) played.• * typeset Competing Interests Use the instructions below to enter a competing interest statement for this submission.On behalf of all authors, disclose any competing interests that could be perceived to bias this work-acknowledging all financial support and any other relevant financial or nonfinancial competing interests.This statement is required for submission and will appear in the published article if the submission is accepted.Please make sure it is accurate and that any funding sources listed in your Funding Information later in the submission form are also declared in your Financial Disclosure statement.View published research articles from PLOS ONE for specific examples.

Data Availability
All data are fully available without restriction. The data sets underlying the results presented in the study are available from references 1, 28, 33, and 38, which are cited in the article.

Fig. 1. Workflow of the proposed method.

Fig. 3. Random sample images from the Flickr datasets. First row, from left to right: noir, romantic, horror, and vintage styles. Second row, from left to right: ethereal, melancholy, and serene styles.

Fig. 4. Sample images of eight different categories of emotions. Top row: four positive emotions. Bottom row: four negative emotions.

Fig. 8. (a) Some samples of training images for user2. (b)-(f) Top 15 images recommended to user2 by different methods. Incorrect recommendations are highlighted in red.
